ABSTRACT

Head Mounted Displays (HMDs) allow users to experience 
virtual reality with a great level of immersion. However, even 
simple physical tasks like drinking a beverage can be difficult 
and awkward while in a virtual reality experience. We explore 
mixed reality renderings that selectively incorporate the physical
world into the virtual world for interactions with physical
objects. We conducted a user study comparing four rendering
techniques that balance immersion in a virtual world with
ease of interaction with the physical world. Finally, we discuss
the pros and cons of each approach, suggesting guidelines
for future rendering techniques that bring physical objects
into virtual reality.
INTRODUCTION

HMDs are now widely available so that consumers can enjoy 
a variety of Virtual Reality (VR) experiences in their living 
rooms. While highly immersive, HMDs occlude the
real world, making physical and social interactions difficult
and awkward. Currently, users have two choices: keep the
HMD on and blindly interact with the world, or take the HMD
off and break their immersion in the virtual experience. Such
context switching between worlds is expensive: it takes time 
to be immersed in a virtual environment [3], and frequent 
switching between worlds can be disorienting. 

In this paper, we explore a design space of rendering techniques
that enable users wearing an HMD to interact with the
physical environment. Our goal is to make interactions with
the physical world more seamless, while keeping the user immersed
in the virtual world. Unlike previous work in augmented
reality [1, 4] that explores using stereo cameras to
superimpose virtual objects on the physical world, we overlay
physical objects on top of a virtual environment (i.e., augmented
virtuality [8]). Users can see their hands and peripheral
objects (e.g., a cup) in the virtual environment, or even
interact socially with co-located players (see Figure 1).

Figure 1. Performing real world, peripheral tasks while using VR
HMDs can be (a) frustrating and messy. Our system, comprising (b)
two inexpensive webcams and (c) augmented virtuality renderings,
allows users to perform peripheral tasks, such as grabbing a drink, while
still being immersed in the virtual experience.

We evaluate this design space of mixed reality renderings 
with a user study comparing different renderings of varying 
visual fidelity across different virtual experiences (a movie, a 
first person shooter, and a racing game). Our results show that 
users prefer renderings which selectively blend virtual and 
physical, while maintaining a one-to-one scaling of the physical
environment. The highest rated rendering allows users
to see their hands, objects of interest and salient edges of the 
surrounding environment. 

This paper’s contributions are twofold: (1) we explore a series
of rendering techniques that use a stereo camera pair to
selectively incorporate aspects of the physical environment
into the virtual experience, and (2) we show results from a 
user study comparing the renderings and discuss the benefits 
and limitations of each approach. 










Figure 2. The four renderings: (a) Object & Hand, (b) Object, Hand & Edges, (c) Real World Windowed, (d) Physical Picture In Picture.


RELATED WORK 

Since the field’s inception, a large body of VR work has explored
incorporating virtual models of physical objects into virtual
experiences. For instance, a low-fidelity model of a user’s hands
can be captured using data input gloves [9]. Previous work
has also explored approaches for directly merging the virtual
world with the physical world [8]. Augmented Reality (AR)
overlays virtual objects onto the physical world, and has a rich
history of use in mobile phones and HMDs (e.g., [2]). In contrast,
our work is aligned with Augmented Virtuality (AV),
where virtual reality is enhanced with parts of the physical
world, grounding the experience in the virtual world. Previous
work in AV has focused on collaborative applications,
including displaying real world video on virtual office windows
[5] or displaying group communication around a virtual
table [10]. More recent work has explored physical depth-based
renderings of a user’s hands in VR for productivity applications
[7]. We instead focus on peripheral physical
interactions, exploring a design space of rendering techniques
that selectively show aspects of the physical world, reinforcing
immersion while minimizing distraction.

DESIGN SPACE 

We highlight a design space that focuses on the tradeoff between
awareness of the physical world and focus on
game play (see Figure 2). We selectively identify and render
aspects of the physical world that provide users with varying
amounts of information about the physical space. These
renderings are not exhaustive, but they gave us an initial starting
point for exploring this design space. The renderings are best
understood by demonstration (please see the accompanying
videos).

Renderings 

Object & Hands (OH): The first rendering shows only the
object of interest and the user’s hands. This is the minimal information
necessary to maintain proprioceptive feedback [9].
This rendering enables the user to focus on the virtual experience,
at the expense of limited knowledge of the physical
environment.

Object, Hands & Context (OHC): The second rendering
shows the object of interest, the user’s hands, and the edges of
surrounding physical objects. This rendering provides additional
context at the expense of potential distraction from the
virtual experience.

Real World Windowed (RWW): The third rendering provides
a windowed view of the physical world, with the virtual world
still shown in the user’s peripheral vision. The real world is
rendered in a fully opaque window at the center of the user’s
visual field. This rendering allows the user to focus on their
interactions in the physical world, while still maintaining peripheral
cues about the virtual environment.

Physical Picture in Picture (PPIP): The fourth rendering
shows the physical world as a picture-in-picture rendering in
the lower right-hand corner of the screen (small version of the
virtual world), mimicking the behaviour of picture-in-picture
televisions. This rendering allows users to interact with the
physical world, without taking up as much screen real estate
as RWW.
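
The renderings above differ mainly in which physical pixels replace virtual ones. The following is a minimal sketch of that idea, not the system's implementation: it assumes frames are small 2D grids of pixel values, that a hypothetical per-pixel mask already marks the hands and the object of interest, and that nearest-neighbour downscaling is good enough for the picture-in-picture view. All function names are illustrative.

```python
def composite_masked(virtual, physical, mask):
    """OH-style overlay: show a physical pixel only where the mask is set."""
    return [
        [p if m else v for v, p, m in zip(v_row, p_row, m_row)]
        for v_row, p_row, m_row in zip(virtual, physical, mask)
    ]


def composite_picture_in_picture(virtual, physical, scale=2):
    """PPIP-style overlay: shrink the physical frame by `scale` (nearest
    neighbour) and paste it into the lower-right corner of the virtual frame."""
    small = [row[::scale] for row in physical[::scale]]
    out = [row[:] for row in virtual]          # copy so the input is untouched
    top = len(out) - len(small)
    left = len(out[0]) - len(small[0])
    for r, small_row in enumerate(small):
        for c, value in enumerate(small_row):
            out[top + r][left + c] = value
    return out


# Toy 4x4 frames: "V" marks a virtual pixel, "P" a physical (camera) pixel.
virtual = [["V"] * 4 for _ in range(4)]
physical = [["P"] * 4 for _ in range(4)]
hand_mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

oh_frame = composite_masked(virtual, physical, hand_mask)
ppip_frame = composite_picture_in_picture(virtual, physical, scale=2)
```

In this framing, OHC would add edge pixels to the mask, and RWW would use a mask that is a solid rectangle at the center of the frame.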

USER STUDY 

The purpose of this study was to elicit qualitative feedback
about the design space of renderings in the context of different
genres of VR experiences. We specifically wanted to
evaluate whether our renderings allow users to remain immersed in
the virtual experience while seeing parts of their physical environment.
We also compared our renderings to the status
quo (baseline) solution for interacting with the physical environment
while wearing an HMD, namely removing the HMD
entirely.

Given the wide variety of VR experiences, we evaluated a
spectrum of experiences that vary in the level of user engagement.
Some VR experiences are entirely passive and require
no input from the user (watching a movie), while other experiences
require continuous attention and high levels of user
input (a racing game). We hypothesized that (1) the preferred
rendering will depend on the virtual content, changing with
varying levels of user engagement, and (2) users will prefer
OH since it provides visual information about
the physical world without being overly distracting.

Virtual Scenarios 

We selected rich visual experiences that are representative of
real-world use cases in VR. We created the following experiences
in Unity3D: (1) watching a movie (http://sintel.org)
in a movie theater, (2) a First Person Shooter (FPS) modified
from Unity3D’s 3rd person AngryBots sample, and (3) a
racing game modified from Unity3D’s Car Tutorial.

The movie is a passive experience with no user input that
uses a limited field of view and requires minimal head movement.
The FPS is fast paced and requires both mouse and
keyboard input with lots of head motion, but still contains
natural pauses in game play for the user to interact with the
physical environment. The racing game is a continuous attention
task, where the user must constantly steer their car or risk
crashing. Only keyboard input is needed, leaving one hand
free to interact with the physical environment.

Figure 3. Mean ratings for each method evaluated in the user study. Participants rated ‘overall satisfaction’ and 8 other factors on a 5-point Likert
scale. The error bars represent standard deviations. OHC is highly rated for overall satisfaction in all 3 VR environments (μ_OHC-Movie = 3.7, μ_OHC-FPS
= 3.7, μ_OHC-Car = 3.7).

Physical Setup 

Our system comprises an HMD with stereo cameras and
vision processing. We use the Oculus Rift DK1, augmented
with 2 Logitech C310 webcams to provide a stereoscopic
view of the real world. The lenses of the cameras were replaced
with 1.8mm lenses (http://www.thingiverse.com/thing:305355)
to provide a wider FOV of approximately 120
degrees, and the cameras were then mounted in a 3D printed mount
(http://www.thingiverse.com/thing:323913). Objects and hands
were segmented using color-based segmentation. Users also
wore headphones to create a fully immersive experience.
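
The text does not specify how the color-based segmentation was implemented. As a rough illustration only, the sketch below thresholds each pixel in HSV space and returns a binary mask; the hue range, the saturation and brightness thresholds, and the function name `segment_by_hue` are all assumptions made for this example.

```python
import colorsys


def segment_by_hue(image, hue_min, hue_max, min_saturation=0.4, min_value=0.2):
    """Return a binary mask that is 1 where a pixel's hue lies in
    [hue_min, hue_max] (degrees) and the pixel is saturated and bright enough.

    `image` is a list of rows of (r, g, b) tuples with channels in 0..255.
    Hue ranges that wrap around 0/360 degrees (e.g. red) are not handled.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_deg = h * 360.0
            in_range = hue_min <= hue_deg <= hue_max
            keep = in_range and s >= min_saturation and v >= min_value
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask


# Toy 2x2 image: red, green, blue, and grey pixels.
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (128, 128, 128)]]
green_mask = segment_by_hue(image, hue_min=90, hue_max=150)
```

Thresholding in HSV rather than RGB is a common choice because hue is less sensitive to lighting changes, but a real setup would still need the ranges tuned to the tracked object and to skin tones.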

Participants 

We recruited 16 subjects (13 male), aged 18-24 years, with
some PC gaming experience and corrected-to-normal vision.
Of our 16 participants, 6 were excluded from the study due
to significant simulator sickness (even with brief pauses between
conditions). Simulator sickness is common with current
HMDs [6], and only 10 participants completed the study,
producing a total of 300 grasping trials. We expect that improvements
in display latency, resolution and refresh rate will
decrease simulator sickness in the near future.

Tasks and Procedure 

We designed a within-subjects user study, where subjects interacted
with 3 VR experiences in a randomized ordering using
a PC keyboard and mouse while seated at a table. While
engaged in the virtual experience, subjects were externally
prompted every 45-60 seconds to pick up a physical cup of
water, drink the water (or simulate drinking), and place the cup back
on the table. This physical task was repeated twice for each
of the 4 rendering methods and the baseline method, in a randomized
ordering with 5 minutes of rest between renderings.


Subjects were instructed to focus on their performance in the
VR experience, simulating real-world conditions where users
are highly engaged in the game/movie. Physical distractor
objects were included on the table as well (a mobile phone,
speakers and pieces of paper). After each physical interaction
with the cup, the experimenter moved the cup to simulate the
user losing track of the physical environment during more
realistic long term play scenarios. In total there were 30 trials
per subject: 3 VR experiences x 2 repetitions x (4 renderings
+ baseline).
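
The trial arithmetic can be checked with a short sketch that builds one participant's schedule. It assumes the condition order is re-shuffled within each VR experience, which the text does not state; the names `build_schedule`, `EXPERIENCES`, and `CONDITIONS` are hypothetical.

```python
import random

EXPERIENCES = ["Movie", "FPS", "Car"]
CONDITIONS = ["OH", "OHC", "RWW", "PPIP", "Lift HMD"]  # 4 renderings + baseline
REPETITIONS = 2


def build_schedule(seed=None):
    """Return one participant's trials as (experience, condition, repetition)."""
    rng = random.Random(seed)
    experiences = EXPERIENCES[:]
    rng.shuffle(experiences)            # randomized order of VR experiences
    schedule = []
    for experience in experiences:
        conditions = CONDITIONS[:]
        rng.shuffle(conditions)         # randomized order of conditions (assumed per experience)
        for condition in conditions:
            for repetition in range(1, REPETITIONS + 1):
                schedule.append((experience, condition, repetition))
    return schedule


schedule = build_schedule(seed=7)
trials_per_subject = len(schedule)          # 3 x 5 x 2
total_trials = 10 * trials_per_subject      # 10 participants completed the study
```

With 10 participants completing the study, this accounts for the 300 grasping trials reported above.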

Between each rendering, subjects completed a questionnaire
inspired by the core modules of [11], where they rated their
overall satisfaction, immersion, level of distraction, ease of
play, etc. (see Figure 3). At the end of the study, subjects
ranked the rendering methods along various dimensions (see
Figure 4), with visual mnemonics to remind the users of each
condition. Finally, we conducted a semi-structured interview
with think-aloud subject feedback.

Results 

The intra-rendering results (see Figure 3) show overwhelming
support for OHC, which was rated as the most
preferred method by participants across all VR scenarios
(μ_OHC-Movie = 3.7, μ_OHC-FPS = 3.7, μ_OHC-Car = 3.7). A
Kruskal-Wallis non-parametric test found significant differences
between visual renderings. A post-hoc Bonferroni-corrected
Wilcoxon test showed that OHC was rated significantly
better than RWW and PPIP, both in Car (Z = -2.713, p < 0.01;
Z = -2.56, p < 0.01) and FPS (Z = -1.732, p < 0.01; Z = -1.99,
p < 0.01).
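
For readers unfamiliar with the Kruskal-Wallis test named above, the sketch below computes its tie-corrected H statistic from scratch. The ratings in the usage lines are hypothetical placeholder values, not the study's data, and the function names are illustrative; in practice a library routine such as `scipy.stats.kruskal` would also supply the p-value.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0     # positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N) over tied values.
    tie_term = sum(pooled.count(v) ** 3 - pooled.count(v) for v in set(pooled))
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    # If every value is identical the statistic is undefined; return 0.0 here.
    return h / correction if correction > 0 else 0.0


# Hypothetical 5-point Likert ratings for three renderings (illustration only).
hypothetical_ratings = {
    "OHC": [4, 4, 5, 3, 4],
    "RWW": [2, 3, 2, 3, 2],
    "PPIP": [2, 2, 3, 1, 2],
}
h_statistic = kruskal_wallis_h(list(hypothetical_ratings.values()))
```

H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom. A Bonferroni correction for the follow-up pairwise tests divides the significance threshold by the number of comparisons.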

This result was further validated in the mean rankings analysis,
where participants consistently ranked OHC highest in
overall satisfaction across all VR scenarios and also for each
individual VR scenario (Figure 4). The baseline condition of
removing the HMD was always the least preferred approach.
However, Figure 3 illustrates a clear pattern where Lift
HMD, RWW and PPIP were more acceptable to participants
in Movie, but became less tolerable in the higher engagement
scenarios (FPS & Car). Contrary to our expectation
of OH being ranked the best method, OH was consistently
ranked second in overall satisfaction, immersion, presence
and distraction with high variance across users (σ_OH-Movie =
1.174, σ_OH-FPS = 1.247, σ_OH-Car = 1.033).

Figure 4. Post user study mean rankings, (left to right) Overall satisfaction
across all VR experiences, overall satisfaction for each VR experience
(Movie, FPS, Car), overall distraction, ease of play and presence
across all VR experiences.

On the whole, participants’ qualitative feedback reflected our
empirical findings. With OHC, users described how the additional
contextual information of their surroundings aided
them in finding the cup. As one user described, “The extra
lines helped me find the cup and put it back, and it wasn’t distracting,
it didn’t break my concentration on the game.” However,
some participants liked the minimal nature of OH, with
one user commenting “even though I could not see the cup...I
got used to the surroundings and the table and had a fair
idea of where to look for the cup.” In contrast, one user who
favored OHC noted for OH, “I felt lost and had to feel the
physical space around me to look for the cup.”

The Lift HMD approach was disliked across VR scenarios
with one participant commenting, “removing the goggles is
immersion breaking, with [OH] and [OHC], I still felt pretty
much in the game. [RWW] and [PPIP] are a little more immersion
breaking.” With RWW, one user commented, “you
are kind of in limbo when you’re doing [RWW], you might
as well lift and do it quickly. I don’t feel like I am part of
the virtual world but I don’t feel like I am in the real world.”
When asked if participants would prefer any other location
for rendering the preview window in PPIP, one user noted, “it
wouldn’t make any significant difference since you still have to
concentrate on a corner which takes away your focus from the
game.”

DISCUSSION & FUTURE WORK 

The clear winner among our selection of visual renderings
was to show users the object of interest and their hands while
using edges to visualize the supporting surfaces. OHC allowed
users to quickly re-acclimate themselves to the physical
environment, particularly when significant head/body motions
disconnected users from their physical surroundings.
Some users suggested that pausing the game would be preferred;
however, this is only possible with non-multiplayer
games. Future work could explore various methods for pausing
the game experience via audio input, touch input on
the HMD, controller input or even automatically detecting a
user’s reaching motion.

Designers looking to visualize aspects of the physical world
should consider balancing the scale of the rendered objects
with their placement in the virtual world. We found that users
felt naturally comfortable seeing a version of the cup that was
close in size to the actual physical cup and from their own
ego-centric viewpoint. This was not the case with the PPIP
technique, which forced users to switch between seeing the
game and the window while requiring additional time to acclimate
to the small-sized view of the physical world. Furthermore,
visual rendering techniques could be designed in
the future to take advantage of unused pixels in the virtual
environment. For example, users frequently thought that the
dashboard of the virtual car could be used to show parts of
the physical world where it would otherwise provide little to
no information in the virtual experience.

In the future, virtual reality experiences could be augmented
to react to physical objects. For instance, new physical toys
could be designed to act as controllers to the game (e.g., guns,
wands, etc.). Designers could also leverage existing objects
as weapons, or enable physical interactions with the environment
to affect the game. For example, drinking a glass of
water could be used to recharge a user’s health in an FPS game.

CONCLUSION 

We have explored a design space of bringing physical real
world objects into a virtual reality experience. We selected
four renderings from this design space and compared
them through an empirical evaluation to understand which
approaches maximize utility while reinforcing immersion.
Lastly, we provide critical considerations necessary for the
design of renderings of real world objects in virtual reality.